This vignette will show you how the telprep package can be used to automatically process data from telemetry flights. The program is designed to read in raw txt files, filter out erroneous signals, determine the living status of fish, and create some pretty pictures. This tutorial highlights the general workflow of the program. For a more detailed description of what each function does, see the help files for the following functions: read_flight_data, channels_merge, replace_date, combine_data, get_date_bins, rm_land_detects, get_best_detections, get_locations, flag_dead_fish, hmm_survival, and make_plot.
Begin by putting the raw txt files into a folder. Txt files from multiple flights can and should be included in the folder. When naming the txt files, include the flight grouping, the flight date, and the receiver location (belly/wing) (e.g., F1_1-26-19_Belly.TXT), as this information will be used later on. Find the directory of the folder (my example files are in “D:/Jordy/flight-data/”) and store the coordinate reference system of the txt files (the proj4string) in the variable crs_in. The function read_flight_data can now be used to read the raw data into R and store it in the variable raw_data:
directory <- "D:/Jordy/flight-data/"
crs_in <- "+proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0"
raw_data <- read_flight_data(directory, crs_in)
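Before reading the data in, it can help to confirm that the folder holds the files you expect and that each name carries the receiver location. The lines below are a minimal base R sketch, not part of telprep. The folder and file names are made up for illustration and are created in a temporary directory so that the example runs anywhere.

```r
# Hypothetical stand-in for the flight-data folder (created in tempdir()).
flight_dir <- file.path(tempdir(), "flight-data-example")
dir.create(flight_dir, showWarnings = FALSE)
file.create(file.path(flight_dir,
                      c("F1_1-26-19_Belly.TXT", "F1_1-26-19_Wing.TXT", "notes.docx")))

# List only the txt files, ignoring the case of the extension.
txt_files <- list.files(flight_dir, pattern = "\\.txt$", ignore.case = TRUE)

# Check that every file name mentions the receiver location.
has_source <- grepl("belly|wing", txt_files, ignore.case = TRUE)
txt_files[!has_source]  # any names printed here should be fixed before reading
```

Files without a txt extension (notes.docx in this toy folder) are ignored by the listing.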
Structurally, raw_data is a list of data.frames (there is one for each txt file in the folder). To see how these data.frames are ordered, run
names(raw_data)
## [1] "tburb_f1_1-26-19-BELLY.TXT"
## [2] "tburb_f1_1-26-19-WING.TXT"
## [3] "tburb_f1_1-27-19-BELLY.TXT"
## [4] "tburb_f1_1-27-19-WING.TXT"
## [5] "tburb_f2_2-6-19-BELLY.TXT"
## [6] "tburb_f2_2-6-19-WING.TXT"
## [7] "tburb_f2_2-7-19-BELLY.TXT"
## [8] "tburb_f2_2-7-19-WING.TXT"
## [9] "tburb_f3_5-19-19-BELLY.TXT"
## [10] "tburb_f3_5-19-19-WING.TXT"
## [11] "tburb_f3_5-20-19-BELLY.TXT"
## [12] "tburb_f3_5-20-19-WING.TXT"
## [13] "tburb_f4_July-19-BELLY.TXT"
## [14] "tburb_f4_July-19-WING.TXT"
## [15] "tburb_f5_10-17-19-BELLY.TXT"
## [16] "tburb_f5_10-17-19-WING.TXT"
## [17] "tburb_f5_10-7-19-BELLY.TXT"
## [18] "tburb_f5_10-7-19-WING.TXT"
## [19] "tburb_f6_Dec-19-BELLY.TXT"
## [20] "tburb_f6_Dec-19-WING.TXT"
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"
## [26] "tburb_f8_Feb-20-WING.TXT"
The contents of “tburb_f1_1-26-19-WING.TXT” are stored in the 2nd data.frame in the list.
If a channel or date was misprogrammed, it needs to be corrected before the contents of raw_data can be combined into a single data.frame.
Suppose that channel 3 was misprogrammed as channel 10 in “tburb_f1_1-26-19-WING.TXT”. To peek at the data, run
head(raw_data[[2]])
## DateTime Channel TagID Antenna Power Y X Status
## 1 2019-01-25 14:10:48 35 80 AH0 120 7203409 749053.1 Active
## 3 2019-01-26 09:30:09 10 42 AH0 111 7192887 747589.6 Mort
## 4 2019-01-26 09:30:21 10 42 AH0 124 7192611 748175.1 Mort
## 5 2019-01-26 09:30:46 10 42 AH0 129 7192046 749351.5 Mort
## 6 2019-01-26 09:30:58 10 42 AH0 97 7191769 749951.0 Mort
## 7 2019-01-26 09:31:10 10 42 AH0 61 7191478 750554.0 Mort
To replace channel 10 with channel 3, the following line of code is run:
raw_data[[2]] <- channels_merge(raw_data[[2]], 10, 3)
head(raw_data[[2]])
## DateTime Channel TagID Antenna Power Y X Status
## 1 2019-01-25 14:10:48 35 80 AH0 120 7203409 749053.1 Active
## 3 2019-01-26 09:30:09 3 42 AH0 111 7192887 747589.6 Mort
## 4 2019-01-26 09:30:21 3 42 AH0 124 7192611 748175.1 Mort
## 5 2019-01-26 09:30:46 3 42 AH0 129 7192046 749351.5 Mort
## 6 2019-01-26 09:30:58 3 42 AH0 97 7191769 749951.0 Mort
## 7 2019-01-26 09:31:10 3 42 AH0 61 7191478 750554.0 Mort
Suppose that a date was misprogrammed in the file “tburb_f5_10-17-19-WING.TXT”. To peek at the data, run
head(raw_data[[16]])
## DateTime Channel TagID Antenna Power Y X Status
## 1 2003-04-25 08:14:11 40 30 A0 49 7203430 749046.0 Active
## 3 2003-04-25 08:14:21 40 26 A0 65 7203428 749045.4 Active
## 5 2003-04-25 08:14:23 40 30 A0 52 7203428 749045.6 Active
## 7 2003-04-25 08:14:24 40 25 A0 49 7203428 749045.7 Active
## 11 2003-04-25 08:14:32 40 57 A0 49 7203427 749045.6 Mort
## 12 2003-04-25 08:14:33 40 31 A0 69 7203427 749045.7 Active
If the correct flight date was “10/17/19”, the following line of code will make the correction:
raw_data[[16]] <- replace_date(raw_data[[16]], new_date ="10/17/19")
head(raw_data[[16]])
## DateTime Channel TagID Antenna Power Y X Status
## 1 2019-10-17 08:14:11 40 30 A0 49 7203430 749046.0 Active
## 3 2019-10-17 08:14:21 40 26 A0 65 7203428 749045.4 Active
## 5 2019-10-17 08:14:23 40 30 A0 52 7203428 749045.6 Active
## 7 2019-10-17 08:14:24 40 25 A0 49 7203428 749045.7 Active
## 11 2019-10-17 08:14:32 40 57 A0 49 7203427 749045.6 Mort
## 12 2019-10-17 08:14:33 40 31 A0 69 7203427 749045.7 Active
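To make the effect of the correction concrete, the following base R sketch swaps the date of a few timestamps while keeping the time of day, which is what the output above shows. It is only an illustration of the idea under that assumption and is not the code that replace_date runs internally. The time zone is set to UTC purely to keep the example reproducible.

```r
# Two misprogrammed timestamps like those shown above.
stamps <- as.POSIXct(c("2003-04-25 08:14:11", "2003-04-25 08:14:21"), tz = "UTC")

# Parse the corrected flight date (month/day/two-digit year).
new_date <- as.Date("10/17/19", format = "%m/%d/%y")

# Keep the time of day, replace the date.
time_of_day <- format(stamps, "%H:%M:%S")
fixed <- as.POSIXct(paste(new_date, time_of_day), tz = "UTC")

format(fixed, "%Y-%m-%d %H:%M:%S")
```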
The function combine_data combines all of the data stored in raw_data into a single data.frame.
An argument (source_vec) is provided so that the source of each txt file can be specified. The argument can, for instance, be used to specify whether a receiver was located on the belly or the wing of the aircraft. Because 26 txt files were contained in the example folder, a vector of length 26 is used to encode this information:
names(raw_data)
## [1] "tburb_f1_1-26-19-BELLY.TXT"
## [2] "tburb_f1_1-26-19-WING.TXT"
## [3] "tburb_f1_1-27-19-BELLY.TXT"
## [4] "tburb_f1_1-27-19-WING.TXT"
## [5] "tburb_f2_2-6-19-BELLY.TXT"
## [6] "tburb_f2_2-6-19-WING.TXT"
## [7] "tburb_f2_2-7-19-BELLY.TXT"
## [8] "tburb_f2_2-7-19-WING.TXT"
## [9] "tburb_f3_5-19-19-BELLY.TXT"
## [10] "tburb_f3_5-19-19-WING.TXT"
## [11] "tburb_f3_5-20-19-BELLY.TXT"
## [12] "tburb_f3_5-20-19-WING.TXT"
## [13] "tburb_f4_July-19-BELLY.TXT"
## [14] "tburb_f4_July-19-WING.TXT"
## [15] "tburb_f5_10-17-19-BELLY.TXT"
## [16] "tburb_f5_10-17-19-WING.TXT"
## [17] "tburb_f5_10-7-19-BELLY.TXT"
## [18] "tburb_f5_10-7-19-WING.TXT"
## [19] "tburb_f6_Dec-19-BELLY.TXT"
## [20] "tburb_f6_Dec-19-WING.TXT"
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"
## [26] "tburb_f8_Feb-20-WING.TXT"
source_vec <- c(rep(c("belly","wing"), 10), rep(c("wing","belly"),2), c("belly","wing"))
source_vec
## [1] "belly" "wing" "belly" "wing" "belly" "wing" "belly" "wing" "belly"
## [10] "wing" "belly" "wing" "belly" "wing" "belly" "wing" "belly" "wing"
## [19] "belly" "wing" "wing" "belly" "wing" "belly" "belly" "wing"
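Typing source_vec by hand is error prone when the belly/wing order changes partway through the list, as it does for the f7 files. If the receiver location always appears in the file name, the vector can instead be derived from the names. The sketch below assumes that naming convention and uses a shortened, hard-coded set of file names so that it runs on its own; in practice names(raw_data) would take its place.

```r
# A subset of the file names shown above, including the f7 files where the
# wing file comes first.
file_names <- c("tburb_f1_1-26-19-BELLY.TXT",
                "tburb_f1_1-26-19-WING.TXT",
                "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT",
                "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT",
                "tburb_f8_Feb-20-BELLY.TXT",
                "tburb_f8_Feb-20-WING.TXT")

is_belly <- grepl("BELLY", file_names, fixed = TRUE)
is_wing  <- grepl("WING", file_names, fixed = TRUE)

# Every file should match exactly one of the two locations.
stopifnot(all(xor(is_belly, is_wing)))

derived_source <- ifelse(is_belly, "belly", "wing")
derived_source
```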
Now that the source of the data has been specified, the function combine_data can be used to combine the contents of raw_data:
all_data <- combine_data(raw_data, source_vec)
head(all_data)
## DateTime Channel TagID Antenna Power Y X Status
## 1100 2019-01-25 14:10:48 35 80 AH0 120 7203409 749053.1 Active
## 1 2019-01-26 09:15:01 63 19 AH0 43 7195053 742058.9 Active
## 2 2019-01-26 09:15:03 63 74 AH0 42 7195024 742140.0 Active
## 8 2019-01-26 09:15:52 63 3 AH0 46 7194144 744417.7 Active
## 16 2019-01-26 09:16:53 10 42 AH0 120 7192957 747439.5 Mort
## 19 2019-01-26 09:17:17 10 42 AH0 119 7192399 748614.9 Mort
## Source
## 1100 wing
## 1 belly
## 2 belly
## 8 belly
## 16 belly
## 19 belly
In order to use the telprep package, a SpatialLinesDataFrame representation of the river system must be imported into R. The following lines of code can be used to import a shapefile (named example.shp) as a SpatialLinesDataFrame object using the readOGR function from the rgdal package:
setwd("D:/Jordy/telprep/telprep/data/sf")
sldf <- rgdal::readOGR("example.shp")
## OGR data source with driver: ESRI Shapefile
## Source: "D:\Jordy\telprep\telprep\data\sf\example.shp", layer: "example"
## with 1 features
## It has 1 fields
The coordinate reference system of the geographic data must match that of the detection data. If the coordinate reference systems do not match, the following line of code will convert the coordinate reference system of the geographic data to that of the detection data.
sldf <- sp::spTransform(sldf, attr(all_data, "crs"))
The riverdist package is used internally for calculations related to river proximity. To use the functionality of this package, sldf (from Step 4) must be converted into a river_network object. The riverdist function line2network can be used to make the conversion (the user is referred to the riverdist package for help).
river_net <- riverdist::line2network(sp=sldf, tolerance = 500)
##
## Units: m
##
## Removed 1 duplicate segments.
##
## Removed 90 segments with lengths shorter than the connectivity tolerance.
The function rm_land_detects can be used to discard the detections that occurred away from the river system. To remove the detections that occurred more than 500 m away from a river channel and store the data in a variable called river_detects, the following line of code is run:
river_detects <- rm_land_detects(all_data, river_net, dist_thresh = 500)
## [1] "be patient -- this could take a few minutes"
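The idea behind the distance threshold can be illustrated without any spatial packages. The sketch below treats the river as a single straight segment and keeps the detections that lie within 500 m of it. It is a simplified stand-in for what rm_land_detects does with the full river network, and the function dist_to_segment and the toy coordinates are hypothetical.

```r
# Shortest distance from points (px, py) to the segment (x1, y1)-(x2, y2).
dist_to_segment <- function(px, py, x1, y1, x2, y2) {
  dx <- x2 - x1
  dy <- y2 - y1
  # Position of the closest point along the segment, clamped to [0, 1].
  t <- ((px - x1) * dx + (py - y1) * dy) / (dx^2 + dy^2)
  t <- pmin(pmax(t, 0), 1)
  sqrt((px - (x1 + t * dx))^2 + (py - (y1 + t * dy))^2)
}

# Toy detections (metres) and a toy river running from (0, 0) to (1000, 0).
detects <- data.frame(X = c(100, 500, 1500), Y = c(200, 900, -300))
d <- dist_to_segment(detects$X, detects$Y, 0, 0, 1000, 0)

dist_thresh <- 500
river_only <- detects[d <= dist_thresh, ]
river_only
```

Only the first toy detection is kept: the second lies 900 m from the segment, and the third is about 583 m from its nearest end.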
The function get_best_locations can be used to determine the best location for each fish in each detection period. In short, the best location is considered to be the location where the highest power detection occurred during each flight period. The date_bins argument of the get_best_locations function specifies the start and end dates of the detection periods. These dates can be found and formatted using the get_date_bins function:
names(raw_data)
## [1] "tburb_f1_1-26-19-BELLY.TXT"
## [2] "tburb_f1_1-26-19-WING.TXT"
## [3] "tburb_f1_1-27-19-BELLY.TXT"
## [4] "tburb_f1_1-27-19-WING.TXT"
## [5] "tburb_f2_2-6-19-BELLY.TXT"
## [6] "tburb_f2_2-6-19-WING.TXT"
## [7] "tburb_f2_2-7-19-BELLY.TXT"
## [8] "tburb_f2_2-7-19-WING.TXT"
## [9] "tburb_f3_5-19-19-BELLY.TXT"
## [10] "tburb_f3_5-19-19-WING.TXT"
## [11] "tburb_f3_5-20-19-BELLY.TXT"
## [12] "tburb_f3_5-20-19-WING.TXT"
## [13] "tburb_f4_July-19-BELLY.TXT"
## [14] "tburb_f4_July-19-WING.TXT"
## [15] "tburb_f5_10-17-19-BELLY.TXT"
## [16] "tburb_f5_10-17-19-WING.TXT"
## [17] "tburb_f5_10-7-19-BELLY.TXT"
## [18] "tburb_f5_10-7-19-WING.TXT"
## [19] "tburb_f6_Dec-19-BELLY.TXT"
## [20] "tburb_f6_Dec-19-WING.TXT"
## [21] "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT"
## [22] "tburb_f7_Jan-20-AG-1stFL-RIGHT ANTENNA-BELLY.TXT"
## [23] "tburb_f7_Jan-20-AG-2ndFL-LEFT ANTENNA-WING.TXT"
## [24] "tburb_f7_Jan-20-AG-2ndFL-RIGHT ANTENNA-BELLY.TXT"
## [25] "tburb_f8_Feb-20-BELLY.TXT"
## [26] "tburb_f8_Feb-20-WING.TXT"
flight_group <- c(1,1,1,1,2,2,2,2,3,3,3,3,4,4,5,5,5,5,6,6,7,7,7,7,8,8)
date_bins <- get_date_bins(raw_data, flight_group)
date_bins
## [,1] [,2]
## [1,] "01/25/19" "01/27/19"
## [2,] "01/28/19" "02/07/19"
## [3,] "05/19/19" "05/24/19"
## [4,] "07/24/19" "07/27/19"
## [5,] "08/05/19" "10/17/19"
## [6,] "12/16/19" "12/18/19"
## [7,] "01/15/20" "01/21/20"
## [8,] "02/05/20" "02/08/20"
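If the flight grouping is encoded in every file name (f1, f2, and so on, as in the example files), the flight_group vector can be extracted from the names rather than typed out. The following sketch assumes the prefix_fN_ pattern seen above and uses a few hard-coded names so that it is self-contained; names(raw_data) would replace file_names in a real session.

```r
file_names <- c("tburb_f1_1-26-19-BELLY.TXT",
                "tburb_f1_1-26-19-WING.TXT",
                "tburb_f4_July-19-BELLY.TXT",
                "tburb_f7_Jan-20-AG-1stFL-LEFT ANTENNA-WING.TXT",
                "tburb_f8_Feb-20-WING.TXT")

# Capture the digits that follow "_f" and precede the next underscore.
derived_group <- as.integer(sub("^[^_]*_f([0-9]+)_.*$", "\\1", file_names))

# A name that does not follow the pattern would produce NA here.
stopifnot(!anyNA(derived_group))
derived_group
```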
After the detection periods have been specified, the function get_best_locations can be used to determine the locations of the fish:
best_locations <- get_best_locations(river_detects, date_bins = date_bins, bin_by = NA, n_thresh = 5, dist_max = 5000, remove_flagged = FALSE)
head(best_locations$all_detects)
## DateTime Channel TagID Antenna Power Y X Status
## 1 2019-01-26 09:15:01 63 19 AH0 43 7195053 742058.9 Active
## 2 2019-01-26 09:15:03 63 74 AH0 42 7195024 742140.0 Active
## 107 2019-01-26 09:44:04 82 2 AH0 50 7144580 833903.8 Active
## 11 2019-01-26 09:46:04 3 47 AH0 105 7164242 795278.5 Mort
## 11100 2019-01-26 09:46:04 10 47 AH0 105 7164242 795278.5 Mort
## 1610 2019-01-26 09:46:37 3 47 AH0 132 7165018 797033.3 Mort
## Source BestSignal FlightNum Dist Records
## 1 belly TRUE 1 0.000 1
## 2 belly FALSE 1 327.404 2
## 107 belly FALSE 1 289.790 3
## 11 wing FALSE 1 2.877 7
## 11100 wing FALSE 1 3.414 10
## 1610 wing FALSE 1 1.425 7
head(best_locations$best_detects)
## DateTime Channel TagID Power Y X Status Source
## 1 2019-01-26 09:15:01 63 19 43 7195053 742058.9 Active belly
## 134 2019-01-26 09:48:45 82 10 52 7136212 849867.7 Active belly
## 56 2019-01-26 11:01:53 35 85 162 7039969 1059337.4 Active wing
## 192 2019-01-26 11:03:04 10 83 183 7030325 1072930.3 Active belly
## 193 2019-01-26 11:03:05 22 84 121 7030359 1072892.3 Active belly
## 202 2019-01-26 11:05:36 34 82 157 7034868 1066905.4 Active belly
## FlightNum Records flag
## 1 1 1 TRUE
## 134 1 1 TRUE
## 56 1 11 FALSE
## 192 1 13 FALSE
## 193 1 1 TRUE
## 202 1 1 TRUE
best_locations$all_detects adds some useful columns to all_data:
- BestSignal indicates the signal with the highest power in a detection period
- Dist is the Euclidean distance (in km) between the detection location and the associated highest power detection
- FlightNum is the detection period
- Records is the number of times that a fish was detected in a detection period

best_locations$best_detects contains the highest power detections only.
- These detections are flagged if there are fewer than n_thresh detections within a distance of dist_max km from the best detection during the detection period.
- The detection will also be flagged if a positive linear relationship exists between Power and Dist for all detections within dist_max km from the best signal in the detection period (i.e. the signal strength increases as the best detection is approached).
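The two flagging rules can be worked through on a toy example. The sketch below is one reading of the description above, written in base R, and is not the telprep implementation: it assumes that the best detection itself counts toward n_thresh, and the data and threshold values are made up.

```r
# Toy detections for one fish in one detection period (coordinates in metres).
detects <- data.frame(Power = c(60, 95, 130, 90, 55),
                      X = c(0, 1000, 2000, 3000, 9000),
                      Y = c(0, 0, 0, 0, 0))

n_thresh <- 5  # hypothetical minimum number of supporting detections
dist_max <- 5  # hypothetical search radius, in km

# The best detection is the one with the highest power.
best <- which.max(detects$Power)

# Euclidean distance (km) from every detection to the best detection.
detects$Dist <- sqrt((detects$X - detects$X[best])^2 +
                     (detects$Y - detects$Y[best])^2) / 1000

# Detections close enough to the best detection to support it.
nearby <- detects[detects$Dist <= dist_max, ]

# Rule 1: too few detections near the best detection.
too_few <- nrow(nearby) < n_thresh

# Rule 2: a positive slope when Power is regressed on Dist.
slope <- coef(lm(Power ~ Dist, data = nearby))[["Dist"]]
positive_slope <- slope > 0

flag <- too_few || positive_slope
flag
```

Here the best detection is flagged because only four detections fall within dist_max of it, even though the fitted slope is negative.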
Two functions (flag_dead_fish and hmm_survival) are provided to help determine the living status of fish.
flag_dead_fish uses locational information to determine which fish have expired. If a fish moves less than dist_thresh km between all consecutive detection periods following a detection, the fish will be flagged as dead. The following lines of code will flag dead fish using this approach:
best_detects <- best_locations$best_detects
head(best_detects)
## DateTime Channel TagID Power Y X Status Source
## 1 2019-01-26 09:15:01 63 19 43 7195053 742058.9 Active belly
## 134 2019-01-26 09:48:45 82 10 52 7136212 849867.7 Active belly
## 56 2019-01-26 11:01:53 35 85 162 7039969 1059337.4 Active wing
## 192 2019-01-26 11:03:04 10 83 183 7030325 1072930.3 Active belly
## 193 2019-01-26 11:03:05 22 84 121 7030359 1072892.3 Active belly
## 202 2019-01-26 11:05:36 34 82 157 7034868 1066905.4 Active belly
## FlightNum Records flag
## 1 1 1 TRUE
## 134 1 1 TRUE
## 56 1 11 FALSE
## 192 1 13 FALSE
## 193 1 1 TRUE
## 202 1 1 TRUE
best_detects <- best_detects[!best_detects$flag, ]
flagged_fish <- flag_dead_fish(best_detects, dist_thresh = 0.5)
head(flagged_fish)
## DateTime Channel TagID Power Y X Status Source
## 56 2019-01-26 11:01:53 35 85 162 7039969 1059337 Active wing
## 192 2019-01-26 11:03:04 10 83 183 7030325 1072930 Active belly
## 69 2019-01-26 11:14:41 35 84 152 7026735 1076430 Mort wing
## 74 2019-01-26 11:15:03 35 83 169 7027807 1075850 Active wing
## 91 2019-01-26 11:16:30 3 83 155 7030738 1072477 Active wing
## 1021 2019-01-26 11:17:54 3 82 171 7033727 1069361 Active wing
## FlightNum Records flag MoveDist MortFlag
## 56 1 11 FALSE NA No
## 192 1 13 FALSE NA No
## 69 1 10 FALSE NA No
## 74 1 12 FALSE NA No
## 91 1 11 FALSE NA No
## 1021 1 7 FALSE NA No
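The movement rule can be illustrated with a single toy fish. The sketch below is one possible reading of the rule in base R and is not the code inside flag_dead_fish: positions are reduced to a single made-up river coordinate in km, and a detection is marked once the move leading to it and every later move fall below dist_thresh.

```r
# Toy river positions (km) of one fish over six detection periods.
pos_km <- c(12.0, 30.5, 41.2, 41.4, 41.3, 41.5)
dist_thresh <- 0.5

# Distance moved since the previous detection period (NA for the first).
move_dist <- c(NA, abs(diff(pos_km)))

# TRUE where the move is known and smaller than the threshold.
small_move <- !is.na(move_dist) & move_dist < dist_thresh

# TRUE where this move and all later moves are small.
still_from_here <- rev(cumprod(rev(small_move))) == 1

mort_flag <- ifelse(still_from_here, "Yes", "No")
data.frame(FlightNum = seq_along(pos_km), MoveDist = move_dist, MortFlag = mort_flag)
```

The first detection always has a MoveDist of NA, which matches the NA values in the first-flight rows of the output above.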
hmm_survival operates similarly to flag_dead_fish; however, this function uses a more sophisticated method to determine the living status of the fish. Briefly, this function uses locational and mortality sensor information to determine the most likely sequence of survival states (called the Viterbi path) for each fish using a Hidden Markov Model (HMM). A benefit of the HMM-based approach is that detection probabilities and survival rates are estimated statistically rather than set by hand. A detailed description of the HMM can be found by running vignette("hmm") in the console.
The following lines of code will fit the HMM to determine the living status of the fish:
library(msm)
hmm_out <- hmm_survival(best_detects)
hmm_out$results
## [[1]]
## estimate lower upper
## annual survival rate 0.8596032 0.8184353 0.8928571
## annual mortality rate 0.1403968 0.1071429 0.1815647
##
## [[2]]
## estimate lower upper
## detection probability live fish 0.2995275 0.2459589 0.3592053
## detection probability expired fish 0.9256420 0.6319511 0.9890412
##
## [[3]]
## [,1]
## [1,] "the mortality signals work for live fish approximately 47 percent of the time"
## [2,] "the mortality signals work for expired fish approximately 97 percent of the time"
head(hmm_out$viterbi)
## DateTime Channel TagID Power Y X Status Source
## 56 2019-01-26 11:01:53 35 85 162 7039969 1059337 Active wing
## 192 2019-01-26 11:03:04 10 83 183 7030325 1072930 Active belly
## 69 2019-01-26 11:14:41 35 84 152 7026735 1076430 Mort wing
## 74 2019-01-26 11:15:03 35 83 169 7027807 1075850 Active wing
## 91 2019-01-26 11:16:30 3 83 155 7030738 1072477 Active wing
## 1021 2019-01-26 11:17:54 3 82 171 7033727 1069361 Active wing
## FlightNum Records flag Viterbi
## 56 1 11 FALSE 1
## 192 1 13 FALSE 1
## 69 1 10 FALSE 1
## 74 1 12 FALSE 1
## 91 1 11 FALSE 1
## 1021 1 7 FALSE 1
In the column hmm_out$viterbi$Viterbi, a value of 1 corresponds to the event that the fish is alive whereas 2 corresponds to the event that the fish has expired.
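To give a feel for what a Viterbi path is, the sketch below runs the Viterbi algorithm on a toy two-state model (1 = alive, 2 = expired) using only the mortality sensor reading. It is a generic textbook version in base R with made-up probabilities; it is not the model fitted by hmm_survival, which also uses locational information and estimates its parameters from the data.

```r
# Most likely state sequence for a discrete HMM (log scale for stability).
viterbi_path <- function(obs, init, trans, emis) {
  n <- length(obs)
  k <- length(init)
  logp <- matrix(-Inf, n, k)  # best log-probability of ending in each state
  back <- matrix(0L, n, k)    # best previous state for each state
  logp[1, ] <- log(init) + log(emis[, obs[1]])
  for (t in seq_len(n)[-1]) {
    for (j in seq_len(k)) {
      cand <- logp[t - 1, ] + log(trans[, j])
      back[t, j] <- which.max(cand)
      logp[t, j] <- max(cand) + log(emis[j, obs[t]])
    }
  }
  # Trace back from the most likely final state.
  path <- integer(n)
  path[n] <- which.max(logp[n, ])
  for (t in rev(seq_len(n - 1))) path[t] <- back[t + 1, path[t + 1]]
  path
}

# Hypothetical parameters: fish start alive and expired fish stay expired.
init  <- c(1, 0)
trans <- matrix(c(0.9, 0.1,
                  0.0, 1.0), nrow = 2, byrow = TRUE)
emis  <- matrix(c(0.70, 0.30,
                  0.03, 0.97), nrow = 2, byrow = TRUE,
                dimnames = list(c("alive", "expired"), c("Active", "Mort")))

status <- c("Active", "Mort", "Active", "Mort", "Mort", "Mort")
path <- viterbi_path(status, init, trans, emis)
path
```

With these toy numbers the early Mort reading is treated as a false alarm, because the fish is recorded as Active afterwards, and the path switches to state 2 only at the fourth detection.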
A plotting function (make_plot) is included in the telprep package. This function is designed to be used throughout the analysis. Examples of how the function can be used are provided here:
# basic plot
par(mfrow=c(1,1))
make_plot(sldf, best_detects)
# darken background
make_plot(sldf, best_detects, darken=2.5)
# change style of background
make_plot(sldf, best_detects, type="esri-topo")
# give each fish a unique color preserved through flights
par(mfrow=c(3,1))
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=1, darken=2.5)
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=2, darken=2.5)
make_plot(sldf, best_detects, col_by_fish=TRUE, flight=3, darken=2.5)
# to plot the locations for a single fish
par(mfrow=c(1,1))
make_plot(sldf, best_detects, channel=10, tag_id=11, darken=2.5)
# to zoom in to a specified extent
extent <- c(x_min=466060, x_max=1174579, y_min=6835662, y_max=7499016)
make_plot(sldf, best_detects, extent, darken=2.5)
# plotting live and dead fish by flight period -- green fish have expired
viterbi <- hmm_out$viterbi  # detections with the Viterbi column from hmm_survival
par(mfrow=c(3,1))
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=1)
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=3)
make_plot(sldf, viterbi, type="bing", darken=2.5, viterbi=TRUE, flight=5)